rustc_const_eval: Expose APIs for signalling foreign accesses to memory #141391
pub fn apply_accesses(
    &mut self,
    mut ids: Vec<AllocId>,
    reads: Vec<std::ops::Range<u64>>,
    writes: Vec<std::ops::Range<u64>>,
) -> InterpResult<'tcx> {
I don't really understand what this function is supposed to do, but it seems to be doing the work of finding the allocation that needs to be adjusted? That logic can entirely live inside Miri; Miri is in control of picking the absolute addresses for all memory so it can do this easily. In fact it can do it more efficiently since it has a list of all allocations sorted by their absolute address.
I think the only change you need inside rustc is a version of `prepare_for_native_write` that takes a range.
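To illustrate the lookup described above, here is a minimal sketch of mapping an absolute access range onto allocations using a list sorted by base address. `AllocSpan` and `split_access` are hypothetical names for illustration only, not Miri's actual data structures; the sketch assumes the spans are sorted and non-overlapping.

```rust
use std::ops::Range;

/// Hypothetical record of where one allocation lives in the address space.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AllocSpan {
    pub id: u64,   // stand-in for an `AllocId`
    pub base: u64, // absolute address of the first byte
    pub size: u64, // length in bytes
}

/// Split an absolute access range into per-allocation relative ranges.
/// Assumes `spans` is sorted by `base` and that spans do not overlap.
pub fn split_access(spans: &[AllocSpan], access: Range<u64>) -> Vec<(u64, Range<u64>)> {
    let mut out = Vec::new();
    if access.start >= access.end {
        return out;
    }
    // Binary search for the first span that ends after the access starts.
    let first = spans.partition_point(|s| s.base + s.size <= access.start);
    for s in &spans[first..] {
        if s.base >= access.end {
            break; // every later span starts past the end of the access
        }
        // Clamp the access to this span and make it relative to the span's base.
        let start = access.start.max(s.base) - s.base;
        let end = access.end.min(s.base + s.size) - s.base;
        if start < end {
            out.push((s.id, start..end));
        }
    }
    out
}
```

Each returned pair names an allocation and the relative range inside it that the access touched, which is the shape a range-taking `prepare_for_native_write` would need.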
I think you're right, I recall running into some roadblock trying to do this within Miri but I don't see it now when looking over it again so it might be fine to move this as you suggested. Ty!
Ah, I tried to move it and realised - the culprit is `get_alloc_raw()`, which is private and wholly inside of rustc. As far as I can tell, it being private is also why `prepare_for_native_call()` was inside rustc and not Miri.
Yeah, and `get_alloc_raw` really should stay private or else we'll have more bugs like #142575...
But we could still expose something that does `get_alloc_raw` + `prepare_for_native_write`.
/// Initialise previously uninitialised bytes in the given range, and set provenance of
/// everything in it to `Wildcard`. Before calling this, make sure all provenance in this
/// range is exposed!
pub fn mark_foreign_write(&mut self, range: AllocRange) {
This seems to skip resetting uninit memory to 0. That step must not be skipped!
We shouldn't need two operations here anyway. Just one operation, `prepare_for_native_write` with a range, that does all the things it used to do, but restricted to a range.
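A rough sketch of what such a single range-restricted operation could look like, on a toy allocation rather than rustc's real `Allocation` type. `ToyAlloc` and `ToyProv` are hypothetical stand-ins; the sketch assumes the operation consists of the three steps discussed in this thread (zero the uninit bytes, mark the range init, set provenance to wildcard).

```rust
use std::ops::Range;

/// Toy provenance: either none, or the wildcard used for exposed memory.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToyProv {
    None,
    Wildcard,
}

/// Toy allocation with one init flag and one provenance entry per byte.
pub struct ToyAlloc {
    pub bytes: Vec<u8>,
    pub init: Vec<bool>,
    pub prov: Vec<ToyProv>,
}

impl ToyAlloc {
    pub fn new(bytes: Vec<u8>, init: Vec<bool>) -> Self {
        assert_eq!(bytes.len(), init.len());
        let prov = vec![ToyProv::None; bytes.len()];
        ToyAlloc { bytes, init, prov }
    }

    /// Range-restricted variant: all three steps, but only inside `range`.
    /// Panics if `range` reaches past the end of the allocation.
    pub fn prepare_for_native_write(&mut self, range: Range<usize>) {
        for i in range {
            if !self.init[i] {
                // Reset uninit bytes to 0 so stale contents cannot leak.
                self.bytes[i] = 0;
                self.init[i] = true;
            }
            // Anything in the range may now carry any exposed provenance.
            self.prov[i] = ToyProv::Wildcard;
        }
    }
}
```

Bytes outside the range keep their init state and provenance, and bytes that were already init keep their value.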
///
/// The allocations in `ids` are assumed to be already exposed.
pub fn prepare_for_native_call(&mut self, ids: Vec<AllocId>) -> InterpResult<'tcx> {
pub fn prepare_for_native_call(
So if `paranoid` is `false`, then this function does what exactly? It seems to do just absolutely nothing.^^
This zeroes out the bytes of uninitialised memory without actually marking it as init. I initially didn't do this, but it resulted in `mark_foreign_write` overwriting the data we cared about with zeroes. That's also why the latter doesn't zero out anything. We first zero the memory, then call the foreign code, and then, without re-zeroing, mark it as init if it was written to after being zeroed.
Where does it do that zeroing? `prepare_for_native_write` only gets called if `paranoid` is true. So what you say and what the code does do not seem to line up, or I am misunderstanding something.
But also, I think we shouldn't have such a 2-stage approach. This seems easier to reason about if we just fully delay everything until the memory gets accessed the first time.
I think I explained myself poorly, sorry. Calling `get_alloc_raw()` has a side effect: by calling `get_global_alloc()`, which calls `adjust_global_allocation()`, which calls `init_allocation()`, it actually initialises that allocation. The point is that if we skip this, calling `get_global_alloc()` post-FFI might "initialise" allocations that were actually written to by the foreign code that Miri doesn't know about. I confirmed this with testing; if we allocate a pointer and pass it across the FFI boundary but skip calling `prepare_for_native_call(false)`, the data written will be replaced with zeroes as soon as `get_global_alloc()` is called after the FFI call has completed. So, as far as I can tell, `init_allocation()` is responsible for said zeroing the first time it's called for a specific allocation.
There's no zeroing anywhere outside `prepare_for_native_write`, so I don't know what you are talking about.
`adjust_global_allocation` will set up the memory of the static with whatever the initial value of the static is. Is that what you mean? That's not "zeroing" though unless the initial value of the static happens to be zero.
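As a toy model of that distinction (not rustc's actual memory machinery): a global gets materialised on first access by copying the static's initial value, which only looks like zeroing when that value is all zeroes. `ToyGlobals` and its methods are hypothetical names for this sketch.

```rust
use std::collections::HashMap;

/// Toy model of lazily materialised globals, keyed by a stand-in allocation id.
pub struct ToyGlobals {
    initial: HashMap<u64, Vec<u8>>, // initial value of each static
    live: HashMap<u64, Vec<u8>>,    // memory that has actually been set up
}

impl ToyGlobals {
    pub fn new(initial: HashMap<u64, Vec<u8>>) -> Self {
        ToyGlobals { initial, live: HashMap::new() }
    }

    pub fn is_materialised(&self, id: u64) -> bool {
        self.live.contains_key(&id)
    }

    /// The first access copies the static's initial value into live memory;
    /// later accesses return the live memory unchanged.
    pub fn get_mut(&mut self, id: u64) -> Option<&mut Vec<u8>> {
        if !self.live.contains_key(&id) {
            let init = self.initial.get(&id)?.clone();
            self.live.insert(id, init);
        }
        self.live.get_mut(&id)
    }
}
```

In this model, anything that bypasses `get_mut` before the first access is replaced by the initial value once materialisation finally happens, and for a zero-initialised static that is indistinguishable from zeroing.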
It's a good point that we need to actually initialize the globals at some point. Why can't we do this lazily on first access, like the other FFI adjustments are done with your approach?
> `adjust_global_allocation` will set up the memory of the static with whatever the initial value of the static is

Ah, that would explain why only tests that had to do with statics had trouble here. Every time I ran into that problem the memory was all zeroes, so I incorrectly assumed that's what it was doing, I guess. I should probably update this then; apologies for my confusion.

> It's a good point that we need to actually initialize the globals at some point. Why can't we do this lazily on first access, like the other FFI adjustments are done with your approach?

I'm not sure how to do that, I guess: if the op was a read instead of a write, we need it to already have been initialised, or else the foreign code will read uninitialised data, no? We only know an access happened 1 instruction after it did happen, so I think we still have to be cautious here and initialise the globals. I might be missing something though.
@rustbot author
Reminder, once the PR becomes ready for a review, use
This PR will allow Miri to internally update its state based on information about foreign accesses performed on its memory during FFI. Necessary as part of rust-lang/miri#4326 to make use of the extra information we gain; currently pending review of the design (see design document in the linked PR), so marked as draft for now.
r? @RalfJung